add 4_0 to default outfile namestr dict #1031

Merged: 2 commits, Apr 17, 2023

Conversation

cammytown (Contributor)

This came up when trying to convert the gpt4all-lora-unfiltered-quantized.bin file

I don't know if it was excluded on purpose.

sw (Contributor) commented Apr 17, 2023

It's also missing from the description for --outtype.

According to the readme, you would use quantize if you wanted q4_0 or q4_1, right?

cammytown (Contributor, Author) commented Apr 17, 2023

> According to the readme, you would use quantize if you wanted q4_0 or q4_1, right?

Not sure, but this did work for my purposes: it converted the file into something llama.cpp can work with.

Let me know if I should add it to --outtype; I'm out of my depth and just did this so other people wouldn't run into the same issue.

Cheers.

prusnak (Collaborator) left a comment


I confirm this fixes my issue when running python3 convert.py models/gpt4all-7B/gpt4all-lora-quantized.bin

prusnak (Collaborator) commented Apr 17, 2023

> Let me know if I should add it to --outtype; I'm out of my depth and just did this so other people wouldn't run into the same issue.

Yes, let's add it there for consistency and completeness.

prusnak merged commit 4ad7313 into ggml-org:master on Apr 17, 2023
jeroen-mostert pushed a commit to jeroen-mostert/llama.cpp that referenced this pull request on Aug 30, 2024:
…ml-org#1031)

The token immediately before an eot token was lost when SSE streaming
was enabled if that token was contained entirely within a stop sequence.
As an example of when this could happen, consider this prompt:
  Type the phrase 'pleas' once.
In a Llama 3-derived model, 'pleas' tokenizes as 'ple' 'as'. The token
'as' is contained within this instruct mode stop sequence:
  <|eot_id|><|start_header_id|>assistant<|end_header_id|>
due to the word 'assistant'. Since `string_contains_sequence_substring`
returns True for 'as', this token is added to `tokenReserve` instead of
being streamed immediately. If the '<|eot_id|>' token was generated
next, the text in `tokenReserve` would be discarded.